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ABSTRACT 

We study galaxy clustering in the framework of halo models, where gravitational clustering 
is described in terms of dark matter halos. At small scales, dark matter clustering statistics 
are dominated by halo density profiles, whereas at large scales, correlations are the result of 
combining non-linear perturbation theory with halo biasing. Galaxies are assumed to follow the 
dark matter profiles of the halo they inhabit, and galaxy formation efficiency is characterized 
by the number of galaxies that populate a halo of given mass. This approach leads to generic 
predictions: the galaxy power spectrum shows a power-law behavior even though the dark matter 
does not, and the galaxy higher-order correlations show smaller amplitudes at small scales than 
their dark matter counterparts. Both are in qualitatively agreement with measurements in galaxy 
catalogs. We find that requiring the model to fit both the second and third order moments of 
the APM galaxies provides a strong constraint on galaxy formation models. The data at large 
scales require that galaxy formation be relatively efficient at small masses, m ~ 10 10 Mq//i, 
whereas data at smaller scales require that the number of galaxies in a halo scale approximately 
as the mass to the 0.8th power in the high-mass limit. These constraints are independent of 
those derived from the luminosity function or Tully-Fisher relation. We also predict the power 
spectrum, bispectrum, and higher-order moments of the mass density field in this framework. 
Although halo models agree well with measurements of the mass power spectrum and the higher 
order S p parameters in N-body simulations, the model assumption that halos are spherical leads 
to disagreement in the configuration dependence of the bispectrum at small scales. We stress the 
importance of finite volume effects in higher-order statistics and show how they can be estimated 
in this approach. 
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1. Introduction 

Understanding galaxy clustering is one of the main goals of cosmology. The wealth of information 
provided by galaxy surveys can only be used to extract useful cosmological information if we understand 
the relation between galaxy and dark matter clustering — biasing. On large enough scales, galaxy biasing 
can be described as a local process, and so galaxy clustering can be used as a direct probe of the primordial 
spectrum and Gaussianity of initial conditions (Fry & Gaztanaga 1993; Frieman & Gaztanaga 1994; Fry 
1994; Fry & Scherrer 1994; Gaztanaga & Mahonen 1996; Matarrese, Verde & Heavens 1997; Gaztanaga & 
Fosalba 1998; Frieman & Gaztanaga 1999; Scoccimarro et al. 2000; Durrer et al. 2000). On smaller scales, 
however, non-negligible contributions from complex astrophysical processes relevant to galaxy formation may 
complicate the description of galaxy biasing. 

Galaxy formation is not yet understood from first physical principles. However, following White & Rees 
(1978) and White & Frenk (1991), a number of prescriptions based on reasonable recipes for approximating 
the complicated physics have been proposed for incorporating galaxy formation into numerical simulations 
of dark matter gravitational clustering (see, e.g., Kauffmann et al. 1999, Somerville & Primack 1999 or 
Benson et al. 2000 for some of the most recent work). These "semianalytic galaxy formation" schemes can 
provide detailed predictions for galaxy properties in hierarchical structure formation models, which can then 
be compared with observations. 

The basic assumption in the semianalytic approach is that galaxy biasing can be described as a two- 
step process. First, the formation and clustering of dark matter halos can be modeled by neglecting non- 
gravitational effects.^ This can be done reasonably accurately following the analytic results of Mo & White 
(1996), Mo, Jing & White (1997) and Sheth & Lemson (1999). Second, the distribution of galaxies within 
halos, which in principle depends on complicated physics, can be described by a number of simplifying 
assumptions regarding gas cooling and feedback effects from supernova. For the purposes of this paper, the 
main outcome of this second step is the number of galaxies that populate a halo of a given mass, iV ga i(m). 

In this paper we consider the problem of galaxy clustering from a complementary point of view to 
semianalytic models. We construct clustering statistics from properties of dark matter halos and the N ga ,i(m) 
relation, and show how these simple ingredients can be put together to make reasonably accurate analytic 
predictions about the galaxy power spectrum, bispectrum, and higher-order moments of the galaxy field. 
We also consider the inverse problem: we show how measurements of galaxy clustering can constrain the 
^Vgal(^) relation. We show in particular that the variance and skewness of the galaxy distribution in the 
APM survey provide significant constraints on the N ga \(m) relation. 

Our approach to gravitational clustering has a long history, dating back to Neyman, & Scott (1952), 
and then explored further by Peebles (1974), and McClelland & Silk (1977a,b;1978). These works considered 
perturbations described by halos of a given size and profile, but distributed at random. A complete treatment 
which includes the effects of halo-halo correlations was first given by Scherrer & Bertschinger (1991). Recent 
work has focused on applications of this formalism to the clustering of dark matter, e.g. the small-scale 
behavior of the two-point correlation function (Sheth & Jain 1997), the power spectrum (Seljak 2000; Peacock 
& Smith 2000; Ma & Fry 2000; Cooray & Hu 2000), and the bispectrum for equilateral configurations (Ma 
& Fry 2000; Cooray & Hu 2000). Interest in this approach has been undoubtedly sparked by recent results 
from numerical simulations on the properties of dark matter halos (Navarro, Frenk, & White 1996, 1997; 



4 Halos here are defined in the sense of Press & Schechter (1974). There is not necessarily a one-to-one correspondence 
between halos and galaxies. 
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Bullock et al. 1999; Moore et al. 1999). 

This paper is organized as follows. In Section 2 we review the halo model formalism for the power 
spectrum, bispectrum and higher-order moments of the smoothed density field, and compare the predictions 
with numerical simulations in Section 3. We discuss in detail the role of finite volume effects which, if 
neglected, can lead to incorrect conclusions regarding higher-order statistics. In Section 4, we use the 
N ga i(m) relation to make predictions for galaxy, rather than dark matter, clustering statistics (also see Jing, 
Mo & Borner 1998; Seljak 2000; Peacock & Smith 2000). To illustrate our approach, we use the N ga \(m) 
relation obtained from the semianalytic models of Kauffmann et al. (1999). In Section 5 we discuss the 
constraints on N ga i(m) derived from analysis of counts- in-cells of the APM survey. Section 6 summarizes 
our conclusions. 



2. Dark Matter Clustering 
2.1. Formalism 

In this section we follow the formalism developed by Scherrer & Bertschinger (1991). The dark matter 
density field is written as 

p(x) = ^ f(x — Xi, rrii) = ^ rrii u(x — Xi,mi) = ^ / dmd 3 x'S(m — mi)5 3 (x' — Xi)mu(x — x' , to), (1) 

i i i 

where / denotes the density profile of a halo of mass rm located at position a;,. The mean density is 

„ = <*) ) - (5> ,■(* - ) . jn^j^u^ - *'), (2) 

where we have replaced the ensemble average by the average over the mass function n(m) (which gives the 
density of halos per unit mass) and an average over space i.e. 6(m — mi)S 3 (x' — x,)) = n(m). Our 
normalization convention is such that / d 3 x'u m (x — x') = l and J n(m)mdm = p. The two-point correlation 
function can be written as 

p 2 £(x — x') = J n(m)m 2 dm J d 3 yu m (y)u m (y + x — x') + J n{m\)midm\ j n(TO 2 )TO2dTO 2 

x J d 3 xiu mi (x - x\) J d 3 x 2 u„ l2 (x' - x 2 ) - x 2 ;m 1 ,m 2 ), (3) 

where the first term describes the case where the two particles occupy the same halo, and the second 
term represents the contribution of particles in different halos, with £(x — x';mi,m 2 ) being the two-point 
correlation function of halos of mass toi and m 2 . Since we are dealing with convolutions of halo profiles, it is 
much easier to work in Fourier space, where expressions become multiplications over the Fourier transform 
of halo profiles. We use the following Fourier space conventions: 

A{ - k) = J J^ e M-ik-x)A{x), (4) 
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and 

(5(k 1 )5(k 2 ))=S B (k 12 )P(k 1 ), (5) 
( 5{k 1 )S{k 2 )S{k 3 ) ) = S B (k 123 )B 123 , (6) 

where P(k) and B 123 = B(k\,k 2 ,k 3 ) denote the power spectrum and bispectrum, respectively. Thus, the 
power spectrum reads 



p 2 P(k) = (2n) 3 j n(m)m 2 dm\u m (k)\ 2 + (2n) 6 j u mi (k)n(mi)midmi J u m2 (k)n(m 2 )m 2 dm 2 P(k;mi,m 2 ), 

(7) 

where P(k; m\,m 2 ) represents the power spectrum of halos of mass mi and m 2 . Similarly, the bispectrum 
is given by 

P 3 B 123 = (27r) 3 J n(m)m 3 dm n 3 =1 u m (ki) + (2ir) 6 J u mi (k 1 )n(m 1 )m 1 dm 1 J u m2 (k 2 )u m2 (k 3 )n(m 2 )m 2 l drn 2 

3 . 

xP(k 1 ;mi,m 2 ) + eye. + (2vr) 9 (]^[ / u mi (k i )n(mi)m i dm i ^B 123 (mi,m 2 ,rn 3 ), (8) 

»=i J 

where B\ 23 {mi, m 2 , m 3 ) denotes the bispectrum of halos of mass mi,m 2} m 3 . So far the treatment has been 
completely general. To make the model quantitative, we must specify the halo profile u m (x), the halo mass 
function n(m) and the halo-halo correlations encoded in P(k;mi,m 2 ), Bi 23 (m\,m 2 ,m 3 ), etc. 

2.2. Halo Profiles 

For the halo profile we use (Navarro, Frenk & White 1997; hereafter NFW) 

fc 3 1 

UR{r) = 4TTR 3 mr cr/R mr (l + cr/R mr )^ (9) 

where / = l/[ln(l + c) — c/(l + c)], R v i r is the virial radius of the halo, related to its mass by m = 
(4'irR 3 lir /3)Ap, where A = 200, 340 for an O = 1, 0.3 universe, respectively. We will also use the Lagrangian 
radius R (the initial radius where the mass m came from) to specify halo sizes; R = i? t ,i r A 1/ ' 3 . It is 
convenient to work in units of the characteristic non-linear mass to* or the equivalent scale i?* (defined 
such that er(i?*) = <5 C ; note that to* = (47ri? 3 /3)p). Since to = 1.16 x lO 12 fi(i?h/Mpc) 3 M /h, for ACDM 
(n = 0.3, fi A = 0.7) with a s = 0.90, R* = 3.135 Mpc/h, so to* = 1.07 x 10 13 M Q /h. The Fourier transform 
of the halo profile reads: 

d3x i u \ i \ 1 
_exp(-tfc. aJ )« fl (r) = ^ r) 



u n( k )= I 77^3 exp(— ife • x)un(r) = ——^u(k, y), (10) 



where y = R/R*, k = fci?*A 1 / 3 , and 



- 5 - 



u(k,y) = f sinK(Si[K(l + c)] - Si(/e)) + cos k(Ci[k(1 + c)] - Ci(«)) 



sin(Kc) 



«(1 + c) 



(11) 



where k = ky/c, Si(x) = f^dt sin(£)/i is the sine integral and Ci(ar) = — dt cos(t)/t is the cosine 
integral function. The concentration parameter quantifies the transition from the inner to the outer slope in 
the NFW profile. For the concentration parameter of halos, we use the result (Bullock et al. 1999) 



c(m) fa 9 



(-) 



-0.13 



(12) 



Note that halos defined as above are somewhat different from the prescription in NFW, which defines 
R V i r as the scale within which the mean enclosed density is 200 times the critical density, independent of 
cosmology. Both approaches agree for f2 = 1, whereas for the model we use fl = 0.3 and thus our virial 
radius is defined at the scale where the mean enclosed density density is 340 x 0.3 ps 100 times the critical 
density. Thus, our characteristic density is smaller than NFW's by a factor of two, and our virial radius 
is correspondingly larger. These two effects partially cancel and lead to a very similar prediction for halo 
profiles (taking also into account that the concentration parameters are slightly different as well). 



2.3. Mass Function 

The mass function in normalized units reads 

dy dy I OLir f \ 

n(ra) m dm = p — h(y) = p — 7 A\ 1 1 + (av 2 )^ p ) exp(— a^ 2 /2), (13) 

y y V 2n V / 

where v = S c /a with 5 C = 1.68 the colapse threshold given by the spherical collapse model, A = 0.5, 0.322, 
p = 0,0.3 and a = 1,0.707 for the PS (Press & Schechter 1974) and ST (Shcth & Tormcn 1999) mass 
function, respectively (see Jenkins et al. 2000 for a recent comparison of these mass functions against N-body 
simulations). Note that in this formula the linear variance is a 2 — a 2 j (R 1 ,y), and 'y(R) = — dh\a\ (R) /din R 
is the logarithmic slope of the linear variance as a function of scale. 



2.4. Halo-Halo Correlations 

Following Mo & White (1996) (see also Mo, Jing & White (1997); Shcth & Lemson (1999); Shcth & 
Tormen (1999) for extensions), halo-halo correlations are described by non-linear perturbation theory plus 
a halo biasing prescription obtained from the spherical collapse modclQ. For the PS and ST mass functions, 
we will need 

61 (m) = l + e 1 +E 1 , (14) 
62(171) = 2(1 + a 2 ){e 1 +E 1 ) + e 2 + E 2 , (15) 



5 A treatment of halo biasing beyond the spherical collapse approximation using perturbation theory is given in Catelan et 
al. (1998) 
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b 3 (m) = 6(a 2 + as) (d + E{) + 3(1 + 2a 2 ) (e 2 + E 2 ) + e 3 + E 3 , (16) 
b 4 (m) = 24(a 3 + a 4 )(ei + £ 4 ) + 12 [a 2 + 2(a 2 + a 3 )l (e 2 + £ 2 ) + 4(1 + 3a 2 )(e 3 + E 3 ) + e 4 + E 4 , (17) 



where 



a^ 2 — 1 av 2 , o „. cw 2 



ei = —7 — -, £2 = -^-(ai^ 2 - 3), e 3 = -^-(a 2 i/ 4 - 6av 2 + 3), 



e 4 = (-p-J ( a v -10a^ + 15), 

2 P /<5 C £ 2 ^ l + 2p \ £ 3 ^ 4(p 2 -l) + 6 P ^ 2 2 
Sl - 1 + K 2 )p' Ei = lT£~ + 7 Sx = l £ + 3£l 

£ 4 2av 2 ( a 2 v± \ (l+p) ^ 4(p 2 - 1) + 8(p - l)a^ 2 + 3 2 
~ 2 — loei + 2 — -5 — h bav t\ 



Ei 5 2 V So J S 2 V ^ 

a 2 = -17/21, a 3 = 341/567, and a 4 = -55805/130977. (18) 

All the En's are zero, and a = 1, if n(m) is given by the PS formula. In this case, our formulae reduce to 
those in Mo, Jing & White (1997), although our expression for 6 4 (m) corrects a typographical error in their 
equation (15c). 

Halo-halo correlations read 

P(k; mi ,m 2 ) = 6i(mi)6i(m 2 )Pi(A;), (19) 
S123 (mi, m 2 ,m 3 ) = 6i(mi)6i(m 2 )6i(m 3 )Sf 2 3 + b 1 (m 1 )b 1 (m 2 )b2(m 3 )P L (k 1 )P L (k 2 ) +cyc, (20) 

and similarly for higher-order moments [see Eqs.(fl4|^5|)]. The symbol Ph{k) and -Bf 2 lJ denotes respectively 
the linear power spectrum and the second-order perturbative bispectrum (Fry 1984) 

Bya = 2F 2 (k 1 ,k 2 )P L {k 1 )P L {k 2 ) + eye, (21) 

where F 2 (ki, k 2 ) = 5/7+1/2 cos0i 2 (/ci/fc 2 +fc 2 /A;i)-|-2/7cos 2 812, with fci-fc 2 = k\k 2 cos#i 2 . By construction, 
the bias parameters obey (n = 2, 3, . . .) 

^nfo)&i(y) = 1, / ^-h(y)b n (y) = 0. (22) 
2.5. Results 

Using the ingredients above, we can rewrite the power spectrum and bispectrum as 

P(k) = hh^k)] 2 P L {k) + M 02 {k,'k), (23) 



-7- 



Pl23 = (n M u(**)) B 123+ (A/ll(fel)Mii(fe 2 )M2l(fe3)i 3 L(fcl)i 3 L(fc2)+CyC.) 

i=l 

+ (M n (fci)M 12 (fc 2 , ^3)^(^1) + eye.) + M 03 (fci,fc 2 ,fc 3 ) (24) 

where (bo = 1) 

Mij(ki, . . . ,kj) = y^n(l/)6i(y)[tt(fci,w)...«(Ai,y)](^-) J ~ 1 . (25) 
It is convenient to define the reduced bispectrum Q123, 

Q123 = — , (26) 

which shows a much weaker scale dependence than the bispectrum itself, since at large scales PT predicts 
B oc P 2 , and at small scales the hierarchical ansatz also predicts such a behavior. 

We can also obtain the one-point moments smoothed at scale R from Fourier space correlations by 
integrating with a top-hat window function in Fourier space, W(kR). For example, from Eq. (p3[), the 
variance is 

a 2 {R) = J d 3 kP(k)W(kR) 2 » a 2 L (R) + J ^n{y)y 3 ^(R,y), (27) 

where 

2A f - a 

y^ = ^J k2dk ^ fc ' ^ 2 ( fcii )> (28) 

and we have assumed that J M 2 1 {k)P[ J (k)d 3 kW(kR) 2 w <J 2 L {R) since Mn(fc) — ► 1 as fc — > 0, and at smaller 
scales the power spectrum is dominated by the second term in Eq. (p3|). Similarly the third moment reads 
{Wi = W(hR)), 



<<5 3 (P)> = J d 3 k 1 d 3 k 2 d 3 k 3 d D (k 123 )B 123 W 1 W 2 W 3 



,s;V r o-i(/?i 3a 2 (R) I ^-n(y) bl (y)y 3 u 2 (R,y) + J ^-n(y)y 6 u(R,yW(R,y). (29) 



There are several approximations involved in this result. First, we take the large-scale limit M±i w 1, 
Mi\ ~ 0, valid to a good approximation because of the consistency conditions, Eq. (p2|). In addition, we 
assume that the configuration dependence of the 1-halo and 2-halo terms in Eq.(^) can be neglected (this 
holds very well for the 2-halo term and approximately for the 1-halo term, as we shall discuss below; e.g. see 
bottom right panel in Fig. ^). We can thus evaluate these terms for equilateral configurations, and simplify 
the angular integration by further assuming W{W 2 W\ 2 as W 2 W 2 , the leading-order term in the multipolar 
expansion. With similar approximations, we can derive higher-order connected moments. Define 
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A l3 {R) = J ^-n{y) b % (y) y 3 ^ [u(R,y)Y u 2 (R,y), (30) 
so that a 2 = a\ + A 00 and ( 5 3 ) = Sf T c4 + ia\A w + A i. Then it follows that 

( 5' ) c = Sr*i + A w + 7^ A u + A 02 , (31) 

( <5 5 >c = + W^U W + 25^-^1^ + 15^U 12 + A 03 , (32) 

where the terms in (S n ) c are ordered from n-halo to 1-halo contributions. The coefficient of an m-halo 
contribution to (S n ) c is given by s(n,m) (e.g. 6 and 7 in the second and third terms of Eq. |3l[]), the 
Stirling number of second kind, which is the number of ways of putting n distinguishable objects (S) into m 
cells (halos), with no cells empty (Scherrer & Bertschinger 1991). Thus, in general we can write 

n-l 

( S n ) c = S^af 1 ^ + s(n, m) a nm S?J a^ m ~ l) A ln ^ + A „_ 2 , (33) 

m=2 



where the first term in Eq. (|33j) represents the n— halo term, the second the contributions from m-halo 
terms, and the last is the 1-halo term. The coefficients a nm measure how many of the terms contribute as 
Ain-m— i, the other contributions being subdominant. For example, in Eq. ( |3l| ) the 2-halo term has a total 
contribution of 7 terms, 4 of them contain 3 particles in one halo and 1 in the other, and 3 of them contain 
2 particles in each. The factor 4/7 is included to take into account that the 3—1 amplitude dominates over 
the 2 — 2 amplitude. Note that in these results we neglected all the contributions from the non-linear biasing 
parameters in view of the consistency conditions, Eq. (p2f). When neglecting halo-halo correlations, Eq.(^) 
reduces to those in Sheth (1996) in the limit that halos are point-size objects (u(k,y) — 1). 

For the perturbative values, we use (Bernardeau 1994) 

opt 34 PT 60712 62 7 2 

* 3 =y-7, S 4 = W -y7+ 37, (34) 
PT _ 200575880 1847200 6940 2 235 3 

5 3 5 6 1 3 3 965T 7 + ~6lT 7 ~ ~TJ 1 ' ( ' 

where for simplicity we neglect derivatives of 7 with respect to scale, which is a good approximation for 
R < 20 Mpc/h. 

Before we compare these predictions for dark matter clustering with numerical simulations, it is im- 
portant to note that, within this framework, there are many ingredients which can be adjusted to improve 
agreement with simulations. Rather than exploring all possible variations, we have chosen to always use 
the NFW halo profile and the dependence of the concentration parameter on mass given earlier, and only 
change the mass function between PS and ST; it turns out that these two models usually bracket the results 
of numerical simulations. Other choices are considered in Seljak (2000), Ma & Fry (2000), Cooray & Hu 
(2000). The sensitivity of the results to these choices reflects the underlying uncertainty in this type of 
calculation. As numerical simulation results converge on the different ingredients, however, the predictive 
power of this method will increase. 
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3. Comparison with Numerical Simulations 

We have run two sets of N-body simulations using the adaptive P 3 M code Hydra (Couchman, Thomas, 
& Pearce 1995). Both have 128 3 particles and correspond to a ACDM model (f2 = 0.3, f^A = 0.7) with 
(78 = 0.9. The first set contains 14 realizations of a box-size 100 Mpc/h, and softening length Z so ft = 100 
kpc/h. The second set has 4 realizations of a box-size 300 Mpc/h, and i so ft = 250 kpc/h, which allows us 
to check for finite volume effects. We have also studied the effects of changing the softening, number of 
time-steps and as described below. 

3.1. Bispectrum Measurements: Finite Volume Effects 

Figure |l| shows the results on the reduced bispectrum for equilateral triangles as a function of scale. 
The square symbols show the measurements from the "large" box (Lbox = 300 Mpc/h) whereas the triangle 
symbols show the measurements from the "small" box (Lbox = 100 Mpc/h). Error bars are obtained from 
the scatter among 4 and 14 realizations, respectively. The disagreement between the results of the large 
and small boxes is a result of finite volume effects; the bispectrum is much more sensitive to the presence 
or absence of massive halos than the power spectrum (we will quantify this below), so the smallness of the 
small box is important. Note that the total volume in the 14 small-box realizations only adds up to about 
half of the volume of a single large-box realization. This translates into a large scatter among realizations 
of the small box; 3 such realizations are shown as solid lines in Fig. ^. Note that not only the amplitude 
of Q oq but also its dependence on scale fluctuates significantly from realization to realization, so one must 
interpret measurements made using only a small number of small volume simulations very cautiously. A 
similar situation holds for higher-order moments (e.g. Colombi, Bouchet & Hernquist 1996), as we shall see 
below. 

As is well known, the distribution of higher-order statistics is non-Gaussian with positive skewness (e.g. 
Szapudi & Colombi 1996; Szapudi et al. 1999); the mean value of a higher-order statistic is a consequence 
of having a few large excursions above the mean, with most values underestimating the mean. This is 
exactly what we see in the small box realizations: most of them are closer to the bottom solid line than 
the top one in Fig. [|. Even fourteen realizations of the small box are not enough to recover the correct 
answer given by the large-box mean (square symbols). In other words, the skewness of the Q distribution 
makes convergence towards the true value much slower for the small-box simulations (Szapudi & Colombi 
1996; see also Scoccimarro 2000 for the bispectrum case). Notice that this is not bias in a statistical sense: 
given sufficient number of realizations, the mean will always converge to the true value; it is just that the 
convergence is slow. It is also important to note that when measuring Q from multiple realizations, one 
should always obtain the average of B and the average of P separately from the realizations, and only at 
the end divide to obtain Q (which is what we have done in making Fig. |l|). Otherwise, a "ratio" bias would 
result (Hui & Gaztanaga 1999), and the skewness measurements from the small box simulations would be 
lower than are shown in Fig. |l|. Such an estimator bias will certainly affect measurements of Q from, say, a 
single realization of a size similar to our small box. 

Figure [l] also shows the predictions of (tree-level) PT, which agree very well with the large-box results 
at large scales, as well as the predictions of hyperextended PT (HEPT; Scoccimarro & Frieman 1999), which 
has been proposed as a description of clustering in the non-linear regime (k > 1 h/Mpc). The agreement 
with HEPT is good up to the resolution scale of our simulations, which we estimate as k w 4 h/Mpc. It 
is not straightforward to assign a resolution scale in Fourier space (i.e. it is not just 2ir/l so ft, since a given 
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Fourier mode has contribution from a range of scales. Two-body relaxation causes the two-point correlation 
function to be underestimated at scales comparable to i so ft, which in turn implies an overestimate of the 
reduced bispectrum Q. To illustrate this, we have run the same realization after halving l so [ t to 50 Kpc/h 
(dashed line in Fig. [j]); this shows that beyond k » 4 h/Mpc the bispectrum results are sensitive to the 
resolution, so we only plot the small-box results up to this scale. Similarly, we only show results for the 
large box up to k « 2 h/Mpc. For our particle number, Z so ft cannot be pushed too much smaller than the 
values we used, else there would be not enough particles in a cell of radius Z so ft to satisfy the fluid limit. We 
have also checked the sensitivity of our results to changes in the number of time steps used in the N-body 
integrator and found no difference. Changing the density parameter f2 leads to extremely small differences 
in the reduced bispectrum Q, we thus present results only for ACDM. This is true not only in the weakly 
non-linear regime, but also in the non-linear regime, as expected from the general nature of the Q dependence 
in the equations of motion (Scoccimarro et al. 1998). 

Recently Ma & Fry (2000) claimed that the hierarchical ansatz is not obeyed in N-body simulations. 
They based their claim on analysis of one single realization — the equivalent to just one of our small boxes. 
Their measurement approximately follows the lower solid line in Fig. [l], which, as we have shown, is seriously 
affected by finite volume effects (we will quantify these effects shortly) . To reliably test the hierarchical ansatz 
at smaller scales than probed here, one must resort to higher resolution simulations, preferably keeping the 
box size as large as possible to avoid finite volume effects. For example, a 300 Mpc/h box 512 3 particle 
simulation would be able to probe up to k w 10 h/Mpc reliably. 



3.2. Comparison with Predictions 

The top panel in Fig. |^ shows the ratio of the power spectrum in our two models (PS and ST mass 
functions) to the power spectrum fitting formula (Hamilton et al. 1991; Jain, Mo & White 1995; Peacock 
& Dodds 1996) as a function of scale k. The dashed (dotted) lines show the contributions to the 1-halo 
term in the PS (ST) case from halos having masses in the range 10 < m/m* < 100, 1 < m/rn* < 10, and 
0.1 < m/m* < 1, from left to right (m* = 1.07 x 10 13 Mq/1i). As the PS mass function has more halos 
than the ST one when m < 40m*, the 1-halo term is enhanced. Note the dip in both predictions at k w 0.5 
h/Mpc, where the amplitude of 1-halo and 2-halo terms are comparable. This is due to our treatment of 
the 2-halo term; we approximate it by simply using linear PT. In practice, non-linear corrections enhance 
this term at scales smaller than the non-linear scale, k « 0.3 h/Mpc. However, when including this term 
one must also take into account exclusion effects (halos cannot be closer than their typical size), otherwise 
the power spectrum at intermediate scales would be overestimated. Since exclusion effects are non-trivial 
to compute (though Sheth & Lemson 1999 suggest how one might do so), the simplest solution is to ignore 
these effects, because they approximately cancel each other. This is a reasonable approximation because at 
the scales where exclusion effects become important, 1-halo contributions dominate. 

The bottom panel in Fig. |^ shows the prediction of halo models for the reduced bispectrum for equilateral 
configurations as a function of scale (solid lines). For the ST case we also show the partial contributions 
from 1-halo, 2-halo and 3-halo terms in dashed lines, which dominate at small, intermediate and large scales, 
respectively. For the PS case we show in dotted lines the contributions to the 1-halo term in Q when the 
bispectrum is restricted to the mass range 10 < m/m* < 100 (which dominates at all scales shown in the 
plot) and 1 < m/rn* < 10. In this case, when taking the ratio in Eq. (^6|), we have used the full power 
spectrum. As expected, comparing the two panels we see that at a given scale the bispectrum is dominated by 
larger mass halos than the power spectrum. For the bispectrum at k « lh/Mpc, this implies that halos with 
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to > 40to* contribute more significantly, thus leading to a higher Q. At smaller scales, say k « lOh/Mpc, the 
PS mass function has more halos of the relevant masses (to < 40to„), so the bispectrum is larger for PS than 
ST (by a factor slightly smaller than the ratio of power spectra), thus the reduced bispectrum Q is higher 
again for the ST case. We have also varied the concentration parameter to test the sensitivity of our results. 
Doubling the concentration parameter (with the same mass dependence) leads to a significant increase of 
the power spectrum at small scales (this consistent with Seljak 2000) and increases Q by about 10% at 
small scales. On the other hand, changing the scaling of the concentration parameter by a factor of two to 
c(to) = 9(to/to*)~ ' 26 , decreases (increases) the power spectrum at scales where to/to* > 1 (to/to* < 1) 
contributes, as expected. For the bispectrum, larger concentration leads to higher Q, although at a given 
scale larger masses contribute than for the power spectrum, so the effects are shifted in scale with respect 
to the power spectrum case. 

Figure [| compares these results with the measurements in the numerical simulations presented in Fig. [l]. 
We see that generally there is good agreement between predictions and the simulations; the simulations seem 
to be roughly in between the PS and ST predictions. At small scales, the halo models predict that Q eq 
increases rapidly with k; the limited resolution of our simulations prevents us from testing this particular 
prediction reliably. As we discussed above, hnite volume effects can be significant when dealing with the 
bispectrum. To quantify this, we have calculated the halo model predictions for cases when the maximum 
halo mass is set to TO max = 5.9 x 10 14 Mq/1i and TO max = 6.8 x 10 14 MQ/h for PS and ST respectively (dot- 
dashed lines) and TO max = 10 14 Mq/1i (dashed lines). These values are those for which the mass functions 
would predict just one halo with mass larger than m max in a (100Mpc/h) 3 volume; but since these are 
cumulative numbers and both mass functions actually overestimate the number of halos when compared to 
simulations (more so PS), a smaller number, such as TO max = 10 14 Mq/1i is perhaps a more reasonable cutoff. 
In any case, we see that the predictions change significantly; in particular, introducing such a cutoff makes 
the scale dependence of Q much more like that seen in most of the realizations of the small box (bottom 
solid line in Fig. |l|). 

One key element in the halo model is that we are using the spherical average (rather than the actual 
shapes) of halo profiles. On the other hand, it is well known that halos found in N-body simulations are not 
spherical, but rather triaxial (Barnes & Efstathiou 1987; Frenk et al. 1988; Zurek, Quinn & Salmon 1988). 
The bispectrum is the lowest-order statistic which is sensitive to the shapes of structures, so one expects to 
find differences for the bispectrum as a function of triangle shape at small scales where halo profiles (1-halo 
terms) determine correlation functions. Figure ^ shows such a comparison at different scales, for triangles 
where k^ = 2fci, as a function of angle between k\ and &2- The top left panel shows that, at large scales, 
the bispectrum agrees reasonably well with simulations; this is of course by construction, since non-linear PT 
holds. At smaller scales (top right panel), however, the predictions become independent of triangle shape at 
scales where there is still noticeable configuration dependence. In fact, this is understood from the bottom 
right panel which shows the partial contributions for the ST case. We see that the 1-halo term (which is 
determined by the halo profiles) has the opposite configuration dependence than the 3-halo term, which 
comes from non-linear PT. 

The fact that Qih is convex can be understood from the spherical approximation. If halos were exactly 
spherical and Q were scale-independent, then one would expect the maximum of Q to occur for equilateral 
configurations. When k^ = 2fci the closest configurations to equilaterals are isosceles triangles with 8 0.67T. 
Aside from an overall slight scale dependence (9 = configurations are somewhat more non-linear than 
6 = 7r), we see that this is indeed the case. Notice also that, if the contribution Qih were flat, the residual 
configuration coming from would be enough to produce agreement with the N-body simulations. At even 
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smaller scales (bottom left panel), the numerical simulation results become approximately flat, but the halo 
models predict a convex configuration dependence due to the fact that Q\h dominates. From these results we 
conclude that although halo models predict bispectrum amplitudes which are in reasonable agreement with 
simulations, the configuration dependence is in qualitative disagreement with simulations. Of course, if we 
knew the actual halo shapes, then they could be incorporated into the models (at the expense of complicating 
the calculations!). 

Finally, in Fig. |^, we compare counts-in-cclls measurements of the higher-order moments of the smoothed 
density field in our simulations with the predictions of halo models. Symbols and error bars are as in Fig.^: 
the top (bottom) solid line in each case corresponds to the ST (PS) prediction. The dashed lines show the 
predictions of HEPT (Scoccimarro & Frieman 1999), and the vertical lines show the softening scale of the large 
and small box. The disagreement of the average of 14 small-box measurements with the large-box average is, 
again, a manifestation of finite volume effects. As expected, the difference becomes increasingly important 
for the higher order moments. Despite the many approximations made in the calculations of S p parameters 
in halo models, the agreement with simulations is quite good. We also see a very good agreement with the 
HEPT predictions, and that the scales where halo models predict a strong scale dependence are beyond the 
limits of our resolution. This also confirms that our prescription for the resolution limit in Fourier space was 
reasonably accurate. Thus, contrary to Ma & Fry (2000), we conclude that higher-resolution simulations in 
bigger boxes are essential if one is to test models of the higher-order correlations reliably. 



We now turn to a discussion of how to use the halo models described above to predict the clustering 
of galaxies. Our treatment follows ideas present in the semianalytic galaxy formation models (Benson et al. 
2000; Kauffmann et al. 1999) and has been also explored by Seljak (2000) and Peacock & Smith (2000) for 
the case of the power spectrum. 



To describe galaxy clustering, we need to know the distribution, the mean and the higher-order moments, 
of the number of galaxies which can inhabit a halo of mass m. In addition, we need to know the spatial 
distribution of galaxies within their parent halo. To illustrate our model predictions, in what follows we 
will assume that the galaxies follow the dark matter profile (we will discuss what happens if we change this 
requirement shortly). This implies that Eq. (R) for galaxies reads 



4. Galaxy Clustering 



4.1. Galaxy Correlation Functions 




(27r) 6 / u mi (k)n(mi)(N ga i(mi))dmi / u m2 (k)n(m 2 ) ( N gal (m 2 ) ) dm 2 P(k; m 1 , m 2 ), (36) 



where the mean number density of galaxies is 




(37) 
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Thus, knowledge of the number of galaxies per halo moments (N™ al (m) ) as a function of halo mass gives 
a complete description of the galaxy clustering statistics within this framework. Note that when the mean 
number of galaxies per halo drops below unity, one must use u m (k) = 1, the point-size halo limit, since in 
this case there is at most a single galaxy (which we assume to be at the center of the halo). 

The results for galaxy power spectrum and bispectrum follow those of the dark matter in Eqs.(|2 
after changing My in Eq. ( p5|) to , where 

£,#!,...,%) ee -1 J ^n(y) bl ( y )[ u (k 1 ,y)...uCk J ,y)](^y~ l{ -^^. (38) 
Note that in the large-scale limit, the galaxy bias parameters are 

:^ g ai) 



h = Gu « tr- / —n{y)bi(y)- 
n g J y 



>9 

Similarly, the galaxy one-point connected moments satisfy 



and 



(39) 



* 2 g = (*l) g ,l + B 00 , (6 3 g ) c = (5 3 PT )gaiK)gai + 3(ai) gal i? 10 + Soi, (40) 
(5 4 g )c = (S 4 PT )gaiK)gai + 2S' 1 Vl) galJ B 10 + 4(a|) galJ B u + B m , (41) 



( 6 5 g ) c = (S PT ) gal K) gal + gSf (a£) gal S 10 + 155f (a|) galJ B u + 5(a|) gal 5 12 + S 03 , (42) 

where 

B l3 {R) ee — ^ / -ln(y) bM ^ +1 > [u(R,y)Y v?{R,y)L^L, (43) 

biUg J y mJ 

and the perturbative moments are given by their local bias counterparts (Fry & Gaztahaga 1993) 

(<7|) ga i = bjal (5 3 PT ) gal = l(s 3 PT + 3c 2 ), (5 4 PT ) g a 1 = ^(^4 PT + 12c 2 ^ PT + 4 C3 + 12 C 2), (44) 
and 



(5 5 PT ) ga i = ^ (S PT + 20c 2 S PT + 15c 2 (S PT ) 2 + (30c 3 + 120c 2 )S 3 PT + 5c 4 + 60c 2 c 3 + 60cf) , (45) 

where Ci = bi/b\ and bi are the effective bias parameters in Eq. (|3^). The halo- halo skewness and kurtosis 
are given by these expressions upon replacing Cj by CiB^j B\^. 
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4.2. Galaxies 

As a first example, we use the results of the semi-analytic models of Kauffmann et al. (1999); these 
N-body simulations, halo and galaxy catalogues are publically available. Sheth & Diaferio (2000) show that 
in these catalogues, the mean number of galaxies N ga i per halo of mass m are well fit by 

(N gal ) = (N B + N R ), (N B ) = 0.7(m/m B ) aB , ( N R ) = (m/m R ) a «, (46) 

where N B and N R represent the number of blue and red galaxies per halo of mass m, and a B = for 
10 n M Q /h <m< m B , a B = 0.8 for m > m B , m B = 4x 10 12 M Q /h, a R = 0.9, and m R = 2.5 x 10 12 M Q /h 
(no lower mass cut-off for R). The physical basis for this relation is as follows. At large masses, the gas 
cooling time becomes larger than the Hubble time, so galaxy formation is suppressed in large-mass halos, 
therefore ( iV ga i ) increases less rapidly than the mass. In small-mass halos, however, effects such as supernova 
winds can blow away the gas from halos, also suppressing galaxy formation, thus the cutoff at small masses. 

To calculate the power spectrum and higher-order statistics we also need the second and higher-order 
moments of N ga \. The second moment is also obtained from the semi-analytic models, and obeys 

(iV gal (iVgai-l))^a 2 (m)(iV gal ), (47) 

where the function a(m) quantifies deviations from Poisson statistics a(m) « log \J m/mn (ran = 10 11 Mq//i) 
for m < 10 13 Mq//i and a(ra) = 1 otherwise; that is, for large masses the scatter about the mean number of 
galaxies is Poisson, whereas for small masses it is sub-Poisson. 

To model the higher-order moments we will assume that the number of galaxies in a halo of mass m 
follows a binomial distribution: 

p(N gal = n\m) = (^j p n m {\ - y m f™- n . (48) 

The binomial distribution is characterized by two parameters, N m and p m , which wc set by requiring 
that the first two moments of the distribution equal those from the semianalytics. Specifically, the first and 
second factorial moments are M m p m and N m Pm(N m p m ~ Pm), and we require that they equal (-/V ga i ) and 
(N ga \(N ga i — 1)), respectively. One can think of J\f m as the maximum number of galaxies which can be 
formed with the available mass m, and of p m as the probability of actually forming a galaxy. For a constant 
NmPm, the small p m limit gives a Poisson distribution. Af m is an increasing function of mass, whereas p m 
peaks at m w 10 12 Mq//i with p m « 0.8. The higher-order factorial moments are completely determined 
once the first two moments have been specified; they obey 

( AT gal (AT ga] - 1) . . . (7V gal - j) ) = a 2 (2a 2 - 1) . . . (ja 2 - j + 1) ( N gai T . (49) 

In this model, all the moments become Poisson-like at the same mass scale, i.e. when a(m) — 1, all the 
factorial moments become Poisson, (N gei \(N ga \ — 1) . . . (A^ai — j) ) = (N ga \ ) J+ ■ However, at small scales, 
where small halos contribute, the galaxy counts per halo are significantly sub-Poisson. Our binomial model 
provides a simple way of accounting for this. To correct the power spectrum and bispectrum for shot noise 
we use the same form as in the Poisson case (Peebles 1980): 
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P c (k) = P(k)-e, B c 123 = B 123 ~e(PZ + PZ + P^)-e 2 , (50) 

but where the parameter e is set to e = P(fc — > oo), to avoid making the corrected power spectrum negative 
at small scales. In the Poisson case, e -1 = (27r) 3 n g , we find that in our prescription e can be smaller than 
the Poisson value by almost a factor of two. Although this model is somewhat arbitrary, our results are 
insensitive to shot noise substraction for k < 5h/Mpc and smoothing scales R > IMpc/h. 

Figure || shows the resulting galaxy correlation functions. The top panel shows the predictions of the 
ST (dot-dashed) and PS (solid) mass functions for galaxies using the relations given above. For comparison, 
we also show the mass power spectrum predicted by the fitting formula of Peacock & Dodds (1996). Note 
how the galaxy power spectrum is well approximated by a power law, even though the dark matter spectrum 
is not. This is due to the fact that galaxy formation is inefficient in massive halos; this suppresses the 1-halo 
term compared to the dark matter case. In addition, at small scales the sub-Poisson statistics of galaxy 
counts per halo suppresses the galaxy spectrum relative to the dark matter. The dotted line in the top 
panel shows how the predictions change for the ST model when the low-mass cutoff in the iV ga i(m) relation 
is raised from m cut = 10 11 Mq/1i to m cut = 10 1:L5 -Mq/1i. At large scales, suppressing low-mass halos leads 
to an increase of the bias factor from b\ — 0.86 to b\ = 1.12. At small scales, since these low-masses do 
not contribute significantly, the overall amplification of the 1-halo term is due to the lower galaxy number 
density, which decreases by almost a factor of two. Therefore, our galaxy clustering predictions are quite 
sensitive to the low-mass cutoff for galaxy formation. 

The bottom panel in Fig. [6] shows the galaxy reduced bispectrum Q ga i for equilateral configurations 
as a function of scale (solid lines), compared to the dark matter value (dashed) for the ST mass function. 
At large scales, the halo models predict a negative quadratic bias, which suppresses the bispectrum. At 
smaller scales, both the power spectrum and bispectrum are suppressed, but at a given scale the bispectrum 
is sensitive to larger mass halos than the power spectrum, so <5 ga i rises more gradually than Qdm- At 
small scales k > lh/Mpc, the different mass weighing of the galaxy bispectrum leads to a suppression 
of the scale dependence of Q. The dot-dashed curve shows again the sensitivity to the low-mass cutoff 
(fi cu t = 10 115 Mq/1i); as expected, the distribution at small scales is more biased. To be more quantitative, 
let's note that at small scales, where 1-halo terms dominate, the scaling of the p-point spectrum is 

r ( N p , ) 

T p {k) ~ / n(m) m p [u m (jfc)]P sa " dm, (51) 

and at small masses to <C to*, n{m) ~ mr' 1 v a [m\ c(m) ~ m~ h and u m (k) ~ [sin/t Sic(ft) — cos k Ci(k)]/ lnc, 
where Sic(«) = ir/2 — Si(«). This expression for the profile follows when c^> 1, at large k, u m (k) ~ (ftlnc) -1 . 
Furthermore, if v(m) ~ to/™ +3 )/ 6 as in the scale-free case, and (iVg al } ~ mP^ 1 ~' i \ changing variable from to 
to n, we have 

Tp(fc) ~ fc[6(l-P+pe)-(n+3)a]/[2(36+l)] ) ( 52 ) 

where we have neglected the weak logarithmic dependence of the profile on the concentration (which leads 
to additional suppresion at small scales) and used that the integral over k converges. This means that the 
reduced spectra Q p (k) ~ T p (fc)/[P(fc)] p ~ 1 scale as Q p (k) ~ k 1 " with 

a(n + 3)-6e 

> = (P ' 2) 2(36+1) ■ (53) 
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This generalizes the derivation in Ma & Fry (2000b) for galaxies. The validity of the hierarchical ansatz, 
7 P w 0, thus depends on the low-mass behavior of the concentration parameter, mass function and N ga i. 
For scale-free initial conditions, the validity of the hierarchical ansatz for the mass correlation functions 
is linked to that of stable clustering (Peebles 1980), although for higher-order correlation functions stable 
clustering takes a different form than the usual two-point requirement that pairwise peculiar velocities cancel 
the Hubble flow (Jain 1997). We see that for p = 3 (the bispectrum), a = 0.4 (ST mass function), e = 0.1, 
n c ff w —2.5 (fc = 3 h/Mpc), and b = 0.13, 73 rs —0.3, which agrees approximately with the behavior of 
Qgai in Fig. ^. This is only qualitative, due to the many approximations involved. On the other hand, 
it captures the general behavior of mass weighing, for b > 0, supressing contributions from massive halos 
(e > 0) preferentially places galaxies in more concentrated halos, thus at a given k one is probing outer 
regions of the NFW profile compared to the mass case. Since in the outer regions u(r) ~ r~ 3 , Q ga i is less 
scale dependent than Q. 

Figured shows the S p parameters as a function of smoothing scale R for the mass (dashed; same as 
ST in Fig ||), the galaxies with TO cut = 10 11 MQ/h (dot-dashed) and the galaxies with m cut = 10 n ' 5 MQ/h 
(solid). Although we see the same general behavior as in the bispectrum case, it is interesting that the 
galaxy S p parameters are smaller than the dark matter ones at small scales, which is similar to the trend 
seen in comparisons of dark matter predictions with real galaxy catalogs. Both galaxy plots assume that 
the galaxy per halo moments obey Eq. (fl9|). We have repeated the calculation assuming Poisson statistics 
(a(m) = 1) and found that the results are the same for scales R > 5 Mpc/h, and that at R = 1 Mpc/h the 
Poisson values are about 15% below those shown in Fig. 0. The sensitivity of the clustering to the details 
of the relation of the number of galaxies as a function of halo mass can be used to probe aspects of galaxy 
formation, as we now discuss. 



5. Comparison with APM Survey: Constraints on Galaxy Formation 

Measurements of two-point and higher order moments of the galaxy field in the APM survey (Maddox 
et al. 1990) provide important constraints on models of galaxy clustering. Here we use the measurements 
of counts-in-cells, deprojected into three dimensions, by Gaztahaga (1994,1995) to constrain the N ga \(m) 
relation. 

As discussed above, galaxy clustering at large scales is given by the standard local-bias model, with bias 
parameters obtained from Eq.(p9|). To constrain the iV ga i(m) relation, we use the parametrized form 

( Ag a i) = {m/m ) ai , m cut < m < m , ( N ga \ ) = (m/m ) a2 , m>m , (54) 

with ( iVg a i ) = for m < m cut and the second moment obeys Eq. ( [47j ) with 



log m/m cut 

a ( m ) = ] — 7 1 7, (55) 

log(r7io/m C ut) 

for m < mo and a(m) — 1 for m > m . For the higher-order moments we adopt the binomial model in 
Eq.(|49"l). Requiring that p m in the binomial model to be positive implies that a\ > 0. Note that in Eq. ( |54| ) 
we have set ( N sa \ ) = 1 at m = mo since clustering statistics do not depend on the overall number density 
of galaxies; however, constraints such as the luminosity function and the Tully-Fisher relation are sensitive 
to the overall amplitude of the N s& \(rn) relation. 
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For the measured values in APM at large scales, we use that a 2 g = 0.167 ± 0.021 at R — 20 Mpc/h, and 
for the higher-order moments we use (S^gai = 3.3 ± 0.5 and (S4) ga i = 15 ± 4 at R — 20 Mpc/h. The latter 
are fiducial values obtained by averaging over the large scale {R > 10) skewness and kurtosis, rather than 
the value at a particular scale. The skewness at large scales is rather flat so averaging is reasonable (see 
Fig. |l0| ), for the kurtosis the large scale limit is not so well defined, but in practice this constraint is not very 
important because it comes with large error bars. 

The constraint on the variance at large scales sets the linear bias parameter, 1 < b\ < 1.15, whereas the 
skewness also depends on the non-linear bias b 2 . In halo models, b\ and hi are not independent — for a given 
mass function they have a specific relation as a function of halo mass, Eqs. (p^-^5[), shown in Fig.^ in terms 
of the threshold parameter v = 5 c /a(m). As a result of this relation, it turns out to be non-trivial to satisfy 
both a 2 and (<S , 3) ga i constraints simultaneously. Essentially, since galaxies in cluster normalized ACDM are 
constrained by the APM variance to be almost unbiased at large scales (&i « 1), a high skewness (as high 
as the mass skewness, shown as dashed lines in Fig. |l0|) requires that b 2 ~ 0, which is not easy to obtain if 
galaxy formation is inefficient at small and large halo masses (the latter is required to suppress the 1-halo 
term contribution to the variance and match the observations at small scales). To quantify constraints on 
the iVg a i(m) relation, we run a Monte Carlo with varying parameters for the N ga i(m) relation, 



10 9 M Q /h < m cut < 10 13 M Q //i, m cut < m < m cut x 10 4 , -1 < ai < 4, < a 2 < 1.5, (56) 

and use the bias parameters from Eq.(|3^) in Eq.(Q) to decide whether a given model is accepted. We 
considered both PS and ST mass functions. For the mass, we use that at R = 20 Mpc/h, a 2 = 0.145, 
S3 = 2.9, and S4 — 14.6. Note that the inferred deprojected variance of the APM corresponds to a median 
redshift z — 0.15, so we extrapolate the predictions from ACDM a% = 0.90 to this redshift. 

If we impose no further constraint on the high-mass slope a 2 , we find that the maximum skewness (at 
R = 20Mpc/h) is S3 — 3 for the ST mass function (with a 2 = 1) and S3 = 3.1 for the PS mass function 
(with 02 = 1.1). However, such a high value for a 2 means that 1-halo terms are not suppressed with respect 
to the mass, and thus the variance and skewness at scales R pa 1 — 5Mpc/h are much larger than observed. 
If we restrict a 2 < 0.9, we find that the maximum skewness for the PS mass function becomes S3 = 2.4, 
uncomfortably small for APM galaxies (but see below for discussion of APM deprojection issues). For the ST 
mass function we find that values as high as S3 = 2.64 are possible (with S4 = 12.6), this requires in addition 
that a± w a 2 = 0.9 and m cut < 10 10 Mq/H, so galaxy formation is relatively efficient in small-mass halos. 
This values are insensitive to mo, as a\ w a 2 and the large-scale clustering is insensitive to the distribution of 
galaxies within halos. Essentially, in this model galaxies trace as much as possible the dark matter. However, 
as shown in Figs. (§jic]) in solid lines (m cut = 8 x 1O 9 M //i, m = 6 x 10 w M Q /h, ax = 1.2, a 2 = 0.9), 
the small-scale variance and skewness are overestimated in this model. So, further suppression of galaxy 
formation in high-mass halos is required, a 2 < 0.9. Similarly, the galaxy model of Eq.(^) (dot-dashed lines 
111 Figs. (10)) suffers fr om the same problem for the skewness. 

Before we turn to constraints derived from small-scale clustering, we should note that these depend on 
at least two additional assumptions (rather than just the N g& \(m) relation). First, we are assuming that 
galaxies trace the dark matter profile. Fig. 2 in Diaferio et al. (1999) shows that this cannot be true for 
both red and blue galaxies. The blue semianalytic galaxies, which should be more like the ones in the APM 
survey, are preferentially located in the outer regions of their parent halos. In the semianalytic models, 
this happens because, in the time it takes for a galaxy's orbit to decay (by dynamical friction) from the 
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edge of its parent halo to the centre, its stars age, so its colour changes from blue to red. Blue galaxies on 
approximately radial orbits which might pass close to the halo centre spend most of their time far from the 
centre anyway. As a result, most galaxies near the halo centre are red, and those further out are blue. To 
mimic this effect, we have studied what happens to our model predictions if we decrease the concentration 
parameter by a factor of two; as explained above, this decreases the variance but it does not change the ratio 
of moments such as the skewness appreciably, so our results seem robust to this effect. 

Second, recall that we are assuming that halo-halo correlations are well described by leading order 
PT. In the case of the clustering of dark matter, this was well justified because at scales where exclusion 
effects become important (invalidating the extrapolation from PT), 1-halo terms dominate over halo- halo 
correlation contributions. For galaxies this may not be true anymore, particularly since many small mass 
halos host at most one galaxy, so the 1-halo term from these halos is suppressed. For example, we find 
that at 3 Mpc/h, the contributions of 1-halo and 2-halo terms to the variance and the third moment of the 
galaxy counts are comparable, whereas this scale is about 5 Mpc/h for the dark matter. If one assumes 
that exclusion effects suppress the contribution of 2-halo terms (see e.g. Fig. 5 in Mo & White 1996, or the 
analytic treatment of exclusion effects in Sheth & Lemson 1999), one might conclude that the skewness is 
higher than what we get by including 2-halo contributions using PT, since these affect more the square of 
the variance than the third moment. On the other hand, Mo, Jing & White (1997) show that, on scales 
smaller than the non-linear scale, our formulae for the 3-halo and 4-halo terms significantly overestimate the 
skewness and kurtosis of the haloes in their simulations (see the bottom right panels of their Figs. 3 and 4). 
Therefore, although these exclusion effects may conspire to approximately cancel out in the end, we should 
bear in mind that non-trivial behavior from exclusion effects can invalidate our conclusions from clustering 
statistics at small scales. 

Modulo these caveats, if we impose the additional constraint that at small scales, e.g. R = IMpc/h, 
(<S , 3)gai 6, we find models with parameters 

TOcut = 4 x 10 9 M Q /h, m = 8 x 10 11 M Q /h, a x = 1, a 2 = 0.8, (57) 
TOcut = 2.5 x 10 10 M Q //i, m = 10 12 M Q //i, ai = 0.8, a 2 = 0.8, (58) 
m cut = 6xl0 10 M Q //i, m o = 1.2xlO 12 M //i, a x = 0.6, a 2 = 0.8, (59) 

the first of which is shown in Figs. p[jlo| as dotted lines (the others have very similar behavior). However, all 
these models predict a small scale variance which is too large; in fact in Fig. ^ we have used a concentration 
parameter a factor of two smaller than Eq.(|l^) to decrease the small-scale variance . We were unable to 
find a model which matched both the variance and skewness of the APM survey at all scales. Therefore, 
we conclude that the skewness provides a stringent constraint on models of galaxy formation. In particular, 
the relatively large skewness at large scales and relatively small skewness at small scales provide opposite 
requirements on the number of galaxies per halo for massive halos. The "large" value of the skewness at 
large scales requires that galaxies trace mass; however the small value of the skewness at small scales requires 
that galaxy formation be suppressed in massive halos. 

In deprojecting from angular to three-dimensional space, Gaztahaga (1994,1995) assumed the validity of 
the hierarchical ansatz for the three- and higher-order correlation functions. At large scales, however, PT pre- 
dicts that the hierarchy of correlation functions is not a simple hierarchical model with constant amplitudes, 
but rather the amplitudes depend strongly on the shape of the configuration (i.e. the bispectrum depends 
on the triangle configuration). This can affect the deprojection (Bernardeau 1995; Gaztanaga & Bernardeau 
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1998); in particular it can lower the three-dimensional skewness and higher-order moments deduced from 
the angular data, perhaps by as much as 20% at R > lOMpc/h (Gaztanaga, private communication). At 
large scales, at least, this would improve the agreement with halo-model predictions. 

6. Conclusions 

We used the formalism of Schcrrer & Bertschinger (1991) to construct an analytic model of the dark 
matter and of galaxies. For the dark matter, we find that the predictions are in good agreement with numeri- 
cal simulations, except for the configuration dependence of the bispectrum at small scales. This (numerically 
small) disagreement can be traced to the assumption that halo profiles are spherically symmetric; in reality 
the halos generically found in CDM simulations are triaxial. In general, halo models provide accuracy no 
better than 20% when compared to simulations. However, as N-body results converge on the values of the 
ingredients of halo models (profiles, mass functions, etc.), their predictions will improve. 

We showed how the finite volume of the simulation box can significantly affect the results of higher- 
order statistics, due to the fact that small boxes have a deficit of massive halos. In particular, we showed 
that, by making suitable cuts in halo mass function, we were able to reproduce the observed behavior of 
the bispectrum in small-box (100 Mpc/h) simulations. At small scales, halo models predict a significant 
departure from the hierarchical scaling, unless the low-mass dependence of the concentration parameter, 
the mass function, and the small-scale slope of the halo profile are different from the currently accepted 
values. Unfortunately, the limited resolution of our simulations cannot test these predictions. We caution 
that Ma & Fry's (2000) conclusion about the breakdown of the hierarchical ansatz in numerical simulations 
is premature as their results are likely to suffer from finite volume effects and inadequate resolution. 

If galaxies in individual dark matter halos trace the dark matter profiles, galaxy clustering is completely 
determined within halo models by specifying the moments of galaxy counts as a function of halo mass. In 
general, suppression of galaxy formation in large-mass halos leads to a power-law like behavior for the galaxy 
power spectrum and higher-order moments which are smaller than for the dark matter. This is similar to 
what is observed in galaxy surveys. However, a quantitative comparison with counts-in-cells statistics in the 
APM survey puts stringent constraints on the galaxy counts as a function of halo mass. At large scales, 
these require that galaxies trace the mass as closely as possible, implying that galaxy formation is relatively 
efficient even in small-mass halos. In addition, the small-scale behavior of the skewness requires a high-mass 
slope for the N ga \(m) relation of about a2 = 0.8, although we found no model which simultaneously fits the 
small scale value of the second moment. The parameters of our "best fit models" are given in Eqs. (|57|-[59"|) . 
These constraints are independent of those derived from the luminosity function and Tully-Fisher relation 
which are sensitive to the overall amplitude of the galaxy counts as a function of halo mass. 

Clearly, halo models provide a useful framework within which to address many interesting questions 
about galaxy clustering, and also related topics such as galaxy-galaxy and quasar-galaxy lensing. Our 
treatment should be of particular interest for the interpretation of clustering in future galaxy surveys, such 
as SDSS and 2dF. In these cases, however, redshift distortions of clustering must also be taken into account. 
We plan to address this issue and others relevant to upcoming galaxy surveys in the near future. 
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0.05 0.1 0.5 1 5 10 

k [h Mpc- 1 ] 

Fig. 1. — The reduced bispectrum Q eq (k) for equilateral configurations as a function of scale. The triangle 
symbols show the average over 14 realizations of box size Lbox = 100 Mpc/h; the 3 solid lines represent 
results for 3 individual realizations. The dashed line denotes the same realization as the companion solid 
line but ran with half the softening length (50 Kpc/h). The square symbols denote the average over 4 
realizations with Lbox = 300 Mpc/h. The disagreement of the large and small box measurements is due to 
finite-volume effects in the latter. 
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Fig. 2. — The top panel shows the ratio of power spectra for the PS and ST mass function to the fitting 
formula for the non-linear power spectrum of Peacock & Dodds (1996) as a function of scale. The dashed 
(dotted) lines show the contributions to the 1-halo term in the PS (ST) case from halos of mass 10 < 
m/m* < 100, 1 < m/m* < 10 and 0.1 < m/m* < 1 from left to right. Bottom panel shows the predictions 
for the reduced equilateral bispectrum; the dashed lines show the individual contributions (for the ST case) 
of 3-halo, 2-halo and 1-halo terms, which dominate at large, intermediate and small scales, respectively. The 
dotted lines show the contributions to the 1-halo term in Q in the PS case from halos in the mass range 
10 < m/m* < 100 and 1 < m/m* < 10. 
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Fig. 3. — Comparison of the N-body results of Fig. [l] (points with error-bars) with the predictions of Fig. || 
(solid lines). The dashed lines show the predictions of halo models when there are no halos of mass larger 
than Tji — 10 14 Mq/Ii in the simulation volume, to illustrate finite volume effects, for both PS (lower 
curve) and ST mass functions. The dot-dashed lines represent the predictions if halos of mass larger than 
777 = 5.9 x 10 14 M Q /h (or m = 6.8 x 1O 14 M //i) are excluded for PS (or ST) mass functions. 
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Fig. 4. — Comparison of the N-body results with the predictions of halo models for the reduced bispectrum 
as a function of triangle shape. Symbols and error bars are as in Fig. [y. Solid lines show the predictions of 
ST (top) and PS (bottom) mass functions. The top panels and lower left panel show different scales, whereas 
the lower right panel shows the partial contributions to the total value, from 1-halo, 2-halo and 3-halo terms 
for the ST case. 
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Fig. 5. — S p parameters (p — 3, 4, 5 from bottom to top) as a function of smoothing scale R. Symbols and 
error bars are as in Fig. [lj Solid lines show the predictions of ST (upper) and PS (lower) mass functions. 
Dashed lines show the asymptotic behavior predicted by HEPT. Again, the disagreement of the small volume 
simulations (triangles) with those of bigger volume (squares) is a direct consequence of finite volume effects 
in the former. The vertical lines denote the softening length of the two sets of simulations. 
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Fig. 6. — The top panel shows the galaxy power spectrum (A(fe) = 47rfc 3 P(fc)) as a function of scale predicted 
by halo models for ST (dot-dashed) and PS (solid) mass functions, using the N ga x(m) given in Eq. (|i^). 
The dashed line shows the mass power spectrum predicted by the fitting formula for the ST case. The 
dotted line shows the power for the ST case if the lower mass cut-off is changed from m cut = lQ^M^/h 
to m cut = 10 115 Mq//i. The bottom panel shows the reduced galaxy bispectrum for equilateral triangles 
as a function of scale for m cu t = 10 1:l Mq//i (solid), m cut = W^^Mq/H (dot-dashed), and for the mass 
(dashed). These assume the ST mass function. 
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Fig. 7. — S p parameters (p — 3, 4, 5 from bottom to top) as a function of smoothing scale R for the mass 
(dashed), galaxies as in Eq.(^6|) with low-mass cutoff at m cu t = IQ 11 Mq/\\ (dot-dashed) and galaxies with 
m cut = 10 n - 5 M Q /h (solid). 
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Fig. 8. — Halo linear (bi, solid lines) and non-linear (62, dashed lines) bias parameters as a function of 
threshold v = 5 c /<j(m) for the PS and ST mass functions. The dot-dashed lines denote b\ — 1 and 62 = 
for comparison. 
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Fig. 9. — The variance of APM galaxies as a function of smoothing scale R (symbols) compared to the 
predictions for galaxies as in Eq.([46|) with m cut = 10 11 A/Q/h (dot-dashed) and galaxies from Eq.(54) (solid) 
with ai = 1.2, 0,2 = 0.9, m cut = 0.8 x 10 10 Mq / h and tuq = 6 x 10 10 Mq/H. Dotted lines show the predictions 
for using Eq. (54) with the parameter values in Eq. (|5?" 
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Fig. 10. — Same as previous figure for the S p parameters. In addition, the dashed lines show the predictions 
for the mass. 



